The widespread use of the Session Initiation Protocol (SIP) as a signalling protocol has created various challenges. An important one is that throughput can degrade severely when a proxy server becomes overloaded, because user agents then retransmit their requests repeatedly. One common approach to overcoming this problem is ‘load balancing’. A load balancer needs to know the status of the proxy servers, which it gathers continuously, either implicitly or explicitly. On average, implicit methods incur less overhead than explicit ones. This paper attempts to prevent throughput reduction by properly balancing the load among the available proxy servers using an implicit mechanism called History Weighted Average Response time. The proposed algorithm is robust because it imposes no extra processing on the proxy servers. The novelty of the mechanism lies in its use of ‘response time history’ to estimate the load currently being processed on each server. In an implementation on a real testbed, the mechanism improves throughput and scalability compared with a prominent state-of-the-art algorithm of the same kind. This improvement stems from the fact that the mechanism requires no modification to SIP, is easy to implement and apply, uses simple computations for decision making, and needs no extra feedback between the servers and the load balancer. Copyright © 2015 John Wiley & Sons, Ltd.
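The idea of estimating server load from response time history can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the sketch assumes an exponentially weighted moving average in which recent samples count more. The class name `ResponseTimeBalancer`, the smoothing factor `alpha`, and the rule of trying unmeasured servers first are all hypothetical choices made for illustration, not the paper's algorithm.

```python
from typing import Dict, List, Optional


class ResponseTimeBalancer:
    """Hypothetical implicit balancer: forwards to the proxy whose weighted
    average response time is lowest. The exponential weighting is an
    assumption; the paper's actual formula is not stated in the abstract."""

    def __init__(self, servers: List[str], alpha: float = 0.5) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha  # weight given to the newest sample (assumed)
        self.avg: Dict[str, Optional[float]] = {s: None for s in servers}

    def record(self, server: str, response_time: float) -> None:
        """Fold one response time, measured at the balancer, into the history."""
        prev = self.avg[server]
        if prev is None:
            self.avg[server] = response_time
        else:
            self.avg[server] = self.alpha * response_time + (1 - self.alpha) * prev

    def pick(self) -> str:
        """Return the server with the lowest weighted average response time."""
        # Servers without any history are tried first to obtain a sample.
        for server, avg in self.avg.items():
            if avg is None:
                return server
        return min(self.avg, key=lambda s: self.avg[s])
```

In this sketch the balancer is assumed to time each request itself, from forwarding to the first response, so the proxy servers send no feedback and SIP messages stay unchanged.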